adaptability of S proteins to orthologous receptors, alterations in the proteolytic cleavage
activation as well as changes in the S protein metastability. A thorough understanding
of the key role of the S protein in CoV entry is critical to further our understanding of
virus cross-species transmission and pathogenesis and for development of intervention
strategies.


1. INTRODUCTION 


Coronaviruses (CoVs) (order Nidovirales, family Coronaviridae, sub- 
family Coronavirinae) are enveloped, positive-sense RNA viruses that con- 
tain the largest known RNA genomes with a length of up to 32 kb. The 
subfamily Coronavirinae, which contains viruses of both medical and 
veterinary importance, can be divided into the four genera alpha-, beta-, 
gamma- and deltacoronavirus (α-, β-, γ- and δ-CoV). The coronavirus particle
comprises at least the four canonical structural proteins E (envelope 
protein), M (membrane protein), N (nucleocapsid protein), and S (spike 
protein). In addition, viruses belonging to lineage A of the betacoronaviruses 
express the membrane-anchored HE (hemagglutinin-esterase) protein. The
S glycoprotein contains both the receptor-binding domain (RBD) and the 
domains involved in fusion, rendering it the pivotal protein in the CoV 
entry process. 

Coronaviruses primarily infect the respiratory and gastrointestinal tract of 
a wide range of animal species including many mammals and birds. Although 
individual virus species mostly appear to be restricted to a narrow host range 
comprising a single animal species, genome sequencing and phylogenetic 
analyses testify that CoVs have crossed the host species barrier frequently 
(Chan et al., 2013; Woo et al., 2012). In fact, most if not all human coronaviruses
seem to originate from bat CoVs (BtCoVs) that were transmitted to
humans directly or indirectly through an intermediate host. It therefore 
appears inevitable that similar zoonotic infections will occur in the future. 

In the past 15 years, the world witnessed two such zoonotic events. In 
2002-2003 cross-species transmissions from bats and civet cats were at 
the base of the SARS (severe acute respiratory syndrome)-CoV epidemic 
that found its origin in the Chinese Guangdong province (Li et al., 2006; 
Song et al., 2005). The SARS-CoV nearly became a pandemic and led to 
over 700 deaths, before it disappeared when the appropriate hygiene and 
quarantine precautions were taken. In 2012, the MERS (Middle East respi- 
ratory syndrome)-CoV emerged in the human population on the Arabian 
Peninsula and continues to have a serious impact on local and global
health systems, with 1800 laboratory-confirmed cases and 640
deaths as of September 1, 2016 (WHO | Middle East respiratory
syndrome coronavirus (MERS-CoV) — Saudi Arabia, 2016). The natural 
reservoir of MERS-CoV is presumed to be in dromedary camels from 
which zoonotic transmissions repeatedly give rise to infections of the lower 
respiratory tract in humans (Alagaili et al., 2014; Azhar et al., 2014; Briese 
et al., 2014; Reusken et al., 2013; Widagdo et al., 2016). Besides these two 
novel CoVs, four other CoVs were previously identified in humans which 
are found in either the alphacoronavirus (HCoV-NL63 and HCoV-229E) or 
the betacoronavirus genera (HCoV-OC43 and HCoV-HKU1). Phylogenetic 
analysis has shown that the bovine CoV (BCoV) has been the origin for 
HCoV-OC43 following a relatively recent cross-species transmission event 
(Vijgen et al., 2006). Moreover, HCoV-NL63, HCoV-229E, SARS-CoV, 
and MERS-CoV also have been predicted to originate from bats (Annan 
et al., 2013; Bolles et al., 2011; Corman et al., 2015; Hu et al., 2015; 
Huynh et al., 2012). 
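
As a small worked example, the MERS-CoV counts cited above can be turned into a crude case fatality percentage. This is only a sketch: the function and constant names are illustrative, and dividing deaths by laboratory-confirmed cases ignores undetected mild infections, so the result is a rough upper-level indicator rather than a precise estimate.

```python
# Figures quoted in the text for MERS-CoV as of September 1, 2016.
MERS_CONFIRMED_CASES = 1800
MERS_DEATHS = 640


def crude_case_fatality(deaths: int, confirmed_cases: int) -> float:
    """Return deaths as a percentage of laboratory-confirmed cases."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return 100.0 * deaths / confirmed_cases


mers_cfr = crude_case_fatality(MERS_DEATHS, MERS_CONFIRMED_CASES)
print(f"Crude MERS-CoV case fatality: {mers_cfr:.1f}%")  # 35.6%
```

The same calculation is not shown for SARS-CoV because the text gives only the approximate number of deaths, not the number of cases.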

In general, four major criteria determine cross-species transmission of a 
particular virus (Racaniello et al., 2015). The cellular tropism of a virus is 
determined by the susceptibility of host cells (i.e., presence of the receptor 
needed for entry) as well as by the permissiveness of these host cells to allow 
the virus to replicate and to complete its life cycle. A third determinant con- 
sists of the accessibility of susceptible and permissive cells in the host. Finally, 
the innate immune response may restrict viral replication in a host species- 
specific manner. The above-mentioned criteria may play a critical role in the 
success of a cross-species transmission event. However, for CoVs, it seems 
that host tropism and changes therein are particularly determined by the sus- 
ceptibility of host cells to infection. While CoV accessory genes, including 
the HE proteins, are thought to play a role in host tropism and adaptation to 
a new host, the S glycoprotein appears to be the main determinant for the 
success of initial cross-species infection events. In this review, we focus on 
the molecular changes in the S protein that underlie tropism changes at the 
cellular, tissue, and host species level and put these in perspective of the 
recently published cryo-EM structures. 


2. STRUCTURE OF THE CORONAVIRUS S PROTEIN 


The CoV S protein is a class I viral fusion protein (Bosch et al., 2003) 
similar to the fusion proteins of influenza, retro-, filo-, and paramyxoviruses 
(Baker et al., 1999; Bartesaghi et al., 2013; Lee et al., 2008; Lin et al., 2014).
Like other class I viral fusion proteins, the S protein folds into a metastable 
prefusion conformation following translation. The size of the abundantly 
N-glycosylated S protein varies greatly between CoV species ranging from 
approximately 1100 to 1600 residues in length, with an estimated molecular 
mass of up to 220 kDa. Trimers of the S protein form the 18-23 nm long,
club-shaped spikes that decorate the membrane surface of the CoV particle. 
Besides being the primary determinant in CoV host tropism and pathogen- 
esis, the S protein is also the main target for neutralizing antibodies elicited 
by the immune system of the infected host (Hofmann et al., 2004). 

The S protein can be divided into two functionally distinct subunits: the
globular S1 subunit is involved in receptor recognition, whereas the S2 subunit
facilitates membrane fusion and anchors S into the viral membrane
(Fig. 1A). The S1 and S2 domains may be separated by a cleavage site that
is recognized by furin-like proteases during S protein biogenesis in the
infected cell. X-ray crystal structures of several S domains have furthered
our understanding of the S protein in the past. In addition, recent elucidation
of the high-resolution structures of the spike ectodomain of two
betacoronaviruses, MHV and HCoV-HKU1, by single-particle cryo-electron
microscopy (Kirchdoerfer et al., 2016; Walls et al., 2016) has provided
novel insights into the architecture of the S trimer in its prefusion state
(Fig. 1B and C).


2.1 Structure of the S1 Subunit


The S1 subunit of the betacoronavirus spike proteins displays a multidomain
architecture and is structurally organized in four distinct domains A-D, of
which domains A and B may serve as an RBD (Fig. 1C). The core structure
of domain A displays a galectin-like β-sandwich fold, whereas domain
B contains a structurally conserved core subdomain of antiparallel β-sheets
(Kirchdoerfer et al., 2016; Li et al., 2005a; Walls et al., 2016; Wang et al.,
2013). Importantly, domain B is decorated with an extended loop on the
viral membrane-distal side. This loop may differ greatly in size and structure
between virus species of the betacoronavirus genus and is therefore also
referred to as the hypervariable region (HVR). The cryo-EM structures of
the MHV-A59 and HCoV-HKU1 S trimers show an intricate interlocking
of the three S1 subunits (Fig. 1B). Oligomerization of the S protomers results
in a closely clustered trimer of the individual B domains close to the threefold
axis of the spike on top of the S trimer, whereas the three A domains are
Fig. 1 Spike protein features and structure of the mouse hepatitis coronavirus spike
glycoprotein trimer. (A) Schematic linear representation of the coronavirus S protein
with relevant domains/sites indicated: signal peptide (SP), two proteolytic cleavage
sites (S1/S2 and S2’), two proposed fusion peptides (FP1 and FP2), two heptad repeat
regions (HR1 and HR2), transmembrane domain (TD), and cytoplasmic tail (CT).
(B) Front and top view of the trimeric mouse hepatitis coronavirus (strain A59) spike
glycoprotein ectodomain obtained by cryo-electron microscopy analysis (Walls et al.,
2016; PDB: 3JCL). Three S1 protomers (surface presentation) are colored in red, blue,
and green. The S2 trimer (cartoon presentation) is colored in light orange. (C) Schematic
representation of the MHV spike protein sequence (drawn to scale); the S1 domains A,
B, C, and D are colored in blue, green, yellow, and orange, respectively, and the linker
region connecting domains A and B in gray; the S2 region is colored in red, and the
TM region is indicated as a black box. Red-shaded region indicates spike region that was

(Continued) 


34 RJ.G. Hulswit et al. 


ordered more distally from the center. In contrast to domains A and B, the S1
C-terminal domains C and D are made up of discontinuous parts of the primary
protein sequence and form β-sheet-rich structures directly adjacent to
the S2 stalk core, while the separate S1 domains are interconnected by loops
covering the S2 surface. Compared to the S2 subunit, the S1 subunit displays
a low level of sequence conservation among species of different CoV genera.
Moreover, S1 subunits vary considerably in sequence length, ranging from
544 (infectious bronchitis virus (IBV) S) to 944 (229E-related bat coronavirus S)
residues (Fig. 2), indicating differences in architecture
of the spikes of species from different CoV genera. Structural information
from the spikes of gamma- and deltacoronavirus species is currently lacking.
Two independently folding domains that can interact with host cell surface
molecules have been assigned in the S1 subunit of alphacoronavirus spikes:
an N-terminal domain (in transmissible gastroenteritis virus (TGEV) S residues
1-245) and a more C-terminal domain (in TGEV S residues
506-655). Contrary to betacoronaviruses, these two receptor-interacting
domains in alphacoronavirus spikes are separated in sequence by some 275
residues, which may fold into one or more separate domains. Structural information
is only available for the C-terminal S1 RBD of two α-CoV S proteins,
which differs notably from that of betacoronaviruses. The RBD in the S1
CTR of alphacoronaviruses displays a β-sandwich core structure, whereas a
β-sheet core structure is seen for betacoronaviruses (Reguera et al., 2012;
Wu et al., 2009).


2.2 Structure of the S2 Subunit


The highly conserved S2 subunit contains the key protein segments that
facilitate virus-cell fusion. These include the fusion peptide, two heptad


Fig. 1 (cont’d) not resolved in the cryo-EM structure. (Lower panel) Two views on
the structure of the mouse hepatitis virus spike glycoprotein protomer (cartoon
representation); domains are colored as depicted earlier. (D) Comparison of the S2 HR1 region
in its pre- and postfusion conformation. (Lower left) Structure of the MHV S2 protomer
(cartoon presentation) with four helices of the HR1 region (and consecutive linker
region) and the downstream central helix colored in blue, green, yellow, orange, and
red, respectively. (Upper right) The structure of a single SARS-CoV S HR1 helix of the
postfusion six-helix bundle structure (PDB: 1WYY) is colored according to the homologous
HR1 region in the MHV S2 prefusion structure shown in the lower left panel. Structures
are aligned based on the N-terminal segment of the central helix (in red). Figures were
generated with PyMOL.
Fig. 2 Overview of currently known receptors and their binding domains within S1.
Schematic representation of coronavirus spike proteins drawn to scale. Yellow boxes
indicate signal peptides. Blue boxes indicate the N-terminal regions in alpha- and
betacoronavirus spike proteins, which were mapped based on sequence homology
between viruses within the same genus. Green boxes indicate known receptor-binding
domains in the C-terminal region of S1. Known receptors are indicated in the boxes: APN,
aminopeptidase N; ACE2, angiotensin-converting enzyme 2; CEACAM, carcinoembryonic
antigen-related cell adhesion molecule 1; Sia, sialic acid; O-ac Sia, O-acetylated sialic
acid; DPP4, dipeptidyl peptidase-4. Gray boxes indicate transmembrane domains. Spike
proteins are shown of PEDV strain CV777 (GB: AAK38656.1), TGEV strain Purdue P115
(GB: ABG89325.1), PRCoV strain ISU-1 (GB: ABG89317.1), Feline CoV strain UU23 (GB:
ADC35472.1), Feline CoV strain UU21 (GB: ADL71466.1), Human CoV NL63 (GB:
YP_003767.1), 229E-related bat CoV with one N domain (GB: ALK28775.1), 229E-related
bat CoV with two N domains (GB: ALK28765.1), Human CoV 229E strain inf-1 (GB:
NP_073551.1), MHV strain A59 (GB: ACO72893), BCoV strain KWD1 (GB: AAX38489),
HCoV-OC43 strain Paris (GB: AAT84362), HCoV-HKU1 (GB: AAT98580), SARS-CoV strain
Urbani (GB: AAP13441), MERS-CoV strain EMC/2012 (GB: YP_009047204), HKU4 (GB:
AGP04928), HKU5 (GB: AGP04943), IBV strain Beaudette (GB: ADP06471), and PDCoV
repeat regions (HR1 and HR2) and the transmembrane domain, which are
well conserved among CoV species across different genera. In the MHV
and HKU1 S prefusion structures, the S2 domain consists of multiple α-helical
segments and a three-stranded antiparallel β-sheet at the viral membrane-
proximal end. A 75 Å long central helix located immediately downstream
of the HR1 region stretches along the threefold axis over the entire length
of the S2 trimer. The HR1 motif itself folds as four individual α-helices along
the length of the S2 subunit, in contrast to the 120 Å long α-helix formed by
this region in postfusion structures (Duquerroy et al., 2005; Gao et al., 2013;
Xu et al., 2004). A 55 Å long helix upstream of the S2’ cleavage site runs parallel
to and is packed against the central helix via hydrophobic interactions (Fig. 1C).
The fusion peptide forms a short helix of which the strictly conserved hydrophobic
residues are buried in an interface with other elements of S2. Unlike
other class I fusion proteins, this conserved fusion peptide (FP1) is not directly
upstream of HR1 but located some 65 residues upstream of this region
(Fig. 1A). Intriguingly, a recently published report provided experimental evidence
for the existence of another fusion peptide (FP2) immediately upstream
of the HR1 region (Ou et al., 2016), which had been predicted earlier based on
the position, hydrophobicity profile and amino acid composition canonical for
class I viral fusion peptides (Bosch and Rottier, 2008; Bosch et al., 2004;
Chambers et al., 1990). The HR2 region is located close to the C-terminal
end of the S ectodomain, but it appeared to be disordered in both cryo-EM
structures and therefore its prefusion conformation remains unknown.

The metastable prefusion conformation of S2 is locked by the cap formed 
by the intertwined S1 protomers. The distal tip of the S2 trimer connects via
hydrophobic interactions with domains B. This distal tip of the S2 trimer 
consists of the C-terminal region of HR1 in the prefusion conformation, 
while the entire HR1 rearranges to form a central three-helix coiled coil 
in the postfusion structure (Duquerroy et al., 2005; Lu et al., 2014; 
Supekar et al., 2004). Interactions between this region of the S2 trimer 
and domain B may therefore prevent premature conformational changes 
resulting in the conversion of the prefusion S protein into the very stable 


Fig. 2 (cont’d) strain USA/Ohio137/2014 (GB: AIB07807). PSI-BLAST analysis using the
NTR of the HCoV-NL63 S protein (residues 16-196) as a query detected two homologous
regions in the first 425 residues of the 229E-related bat coronavirus spike protein
(GB: ALK28765.1), designated N1 (residues 32-213) and N2 (residues 246-422), with
32% and 35% amino acid sequence identity, respectively, suggesting a duplication
of the NTR. Spike proteins are drawn to scale and aligned at the position of the
conserved fusion peptide (FP1).
postfusion structure. Domains C and D of the betacoronavirus S1
subunit and the linker region connecting domains A and B also interact with
the surface of the adjacent S2 protomer and may hence play a role in stabilizing
the prefusion S2 trimer. Domain A appears to play a minor role in
this respect in view of the relatively small surface area with which it interacts with
the S2 trimer.


3. SPIKE-RECEPTOR INTERACTIONS
3.1 Different Domains Within S1 May Act as RBD


Over the past decades, molecular studies on the CoV S glycoprotein have
shown that both the N-terminal region (NTR, domain A in β-CoV) and
the C-terminal region of S1 (CTR, comprising domains B, C, and D in
β-CoV) can bind host receptors and hence function as RBDs (Fig. 2)
(Li, 2015). The CTR of alpha- and betacoronaviruses appears to bind
proteinaceous receptors exclusively. The α-CoVs HCoV-229E, serotype
II feline CoV (FCoV), TGEV, and porcine respiratory coronavirus (PRCoV) use
the aminopeptidase N (APN) of their respective hosts as receptors
(Bonavia et al., 2003; Delmas et al., 1992; Reguera et al., 2012).
HCoV-NL63 (α-CoV) and SARS-CoV (β-CoV) both utilize
angiotensin-converting enzyme 2 (ACE2) as a functional receptor (Li
et al., 2005b; Wu et al., 2009), whereas the β-CoVs MERS-CoV and
BtCoV-HKU4 recruit dipeptidyl peptidase-4 (DPP4) as a functional receptor
(Lu et al., 2013; Mou et al., 2013; Raj et al., 2013; Wang et al., 2014;
Yang et al., 2014).

The receptor-binding motifs (RBMs) in the S1 CTRs of alpha- and
betacoronavirus spike proteins are presented on one or more loops extending
from the β-sheet core structure. Within the alpha- and betacoronavirus
genera the RBD core is structurally conserved, yet the RBM(s) that determine
receptor specificity may vary extensively. For instance, the CTRs
of the α-CoVs PRCoV and HCoV-NL63 have a similar core structure,
suggesting a common evolutionary origin, but have diverged in their RBMs,
recruiting different receptors (APN and ACE2, respectively). A similar situation
is seen for the CTRs of the β-CoVs SARS-CoV and MERS-CoV, which
bind ACE2 and DPP4, respectively (Li, 2015). Conversely, the CTRs of
the α-CoV HCoV-NL63 and the β-CoV SARS-CoV both recognize ACE2,
yet via distinct molecular interactions (ACE2 recognition via three vs
one RBM, respectively), which suggests a convergent evolution pathway
for these viruses in recruiting the ACE2 receptor (Li, 2015). The core
structures of the CTRs in α- and β-CoVs provide a scaffold to present
RBMs from extending loop(s), which may accommodate facile receptor
switching by subtle alterations in or exchange of the RBMs via
mutation/recombination.

Contrary to the CTR, the NTR appears to mainly bind glycans. The
NTR of the α-CoV TGEV and of the γ-CoV IBV S proteins binds to sialic
acids (Promkuntod et al., 2014; Schultze et al., 1996), while the NTR
of betacoronaviruses including BCoV and HCoV-OC43 was shown to
bind to O-acetylated sialic acids (Kunkel and Herrler, 1993; Peng
et al., 2012; Schultze et al., 1991; Vlasak et al., 1988). Only the NTR of
MHV (domain A) is known to interact with a protein receptor, being
mCEACAM1a (Peng et al., 2011), while lacking any detectable sialic acid-binding
activity (Langereis et al., 2010). However, as the NTR of MHV
displays the β-sandwich fold of the galectins, a family of sugar-binding proteins,
it probably has evolved from a sugar-binding domain (Li, 2012).

The presence of RBDs in different domains of the S protein that can bind
either proteinaceous or glycan receptors illustrates a functional modularity of
this glycoprotein in which different domains may fulfill the role of binding
to cellular attachment or entry receptors. The CoV S protein is thought to
have evolved from a more basic structure in which receptor recognition was
confined to the CTR within S1 (Li, 2015). The observed deletions of the
NTR in some CoV species in nature indicate that this domain is subject to a
less stringent requirement and is less tightly integrated with other regions of
the spike trimer than the more C-terminally located domains of S1, and support
a scenario in which the NTR has been acquired at a later time point in CoV
evolutionary history. For example, the NTR of MHV, which displays a human
galectin-like fold, was suggested to originate from a cellular lectin acquired
early on in CoV evolution (Peng et al., 2011). Acquisition of glycan-binding
domains and fusion thereof to the ancestral S protein may have resulted in a
great extension of CoV host range and may have caused an increase in CoV
diversity. The general preference of the NTR and CTR to bind to, respectively,
glycan or protein receptors may be related to their arrangement in the
S protein trimer. In contrast to the CTR, which is located in the center of
the S trimer, the NTR is more distally oriented (Fig. 1B). As protein-glycan
interactions are often of low affinity, the more distal orientation of domain
A may allow multivalent receptor interactions, thereby increasing avidity.
Interestingly, some CoVs appear to have a dual receptor usage as they
may bind via their NTR and CTR to glycan and protein receptors, respectively
(Fig. 2).
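
The receptor assignments described in this section can be collected in a small lookup table, which makes the modularity argument easy to inspect. The sketch below is illustrative only: the dictionary layout and function names are assumptions, and the table holds only the assignments named in the text of Section 3.1 (not everything shown in Fig. 2). A value of None means that no receptor is named in the text for that region, not that none exists.

```python
from collections import defaultdict

# Receptor usage as stated in Section 3.1 (NTR = N-terminal region of S1,
# CTR = C-terminal region of S1). None: no receptor named in the text.
RECEPTOR_USAGE = {
    "HCoV-229E": {"genus": "alpha", "NTR": None, "CTR": "APN"},
    "FCoV (serotype II)": {"genus": "alpha", "NTR": None, "CTR": "APN"},
    "TGEV": {"genus": "alpha", "NTR": "Sia", "CTR": "APN"},
    "PRCoV": {"genus": "alpha", "NTR": None, "CTR": "APN"},
    "HCoV-NL63": {"genus": "alpha", "NTR": None, "CTR": "ACE2"},
    "SARS-CoV": {"genus": "beta", "NTR": None, "CTR": "ACE2"},
    "MERS-CoV": {"genus": "beta", "NTR": None, "CTR": "DPP4"},
    "BtCoV-HKU4": {"genus": "beta", "NTR": None, "CTR": "DPP4"},
    "BCoV": {"genus": "beta", "NTR": "O-ac Sia", "CTR": None},
    "HCoV-OC43": {"genus": "beta", "NTR": "O-ac Sia", "CTR": None},
    "MHV": {"genus": "beta", "NTR": "CEACAM1a", "CTR": None},
    "IBV": {"genus": "gamma", "NTR": "Sia", "CTR": None},
}


def dual_receptor_viruses(usage):
    """Viruses for which both an NTR and a CTR receptor are listed."""
    return [v for v, e in usage.items() if e["NTR"] and e["CTR"]]


def cross_genus_receptors(usage):
    """Receptors that are used by viruses from more than one genus."""
    genera = defaultdict(set)
    for entry in usage.values():
        for region in ("NTR", "CTR"):
            if entry[region] is not None:
                genera[entry[region]].add(entry["genus"])
    return sorted(r for r, g in genera.items() if len(g) > 1)


print(dual_receptor_viruses(RECEPTOR_USAGE))   # ['TGEV']
print(cross_genus_receptors(RECEPTOR_USAGE))   # ['ACE2', 'Sia']
```

With these entries, TGEV is the only virus listed with both a glycan (NTR) and a protein (CTR) receptor, and ACE2 and sialic acid are the receptors shared across genera.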


Coronavirus Spike Protein and Tropism Changes 39 


3.2 CoV Protein Receptor Preference 


Although the number of currently known CoV receptors is limited, receptor 
usage does not appear to be necessarily conserved between closely related 
virus species such as HCoV-229E (APN) and HCoV-NL63 (ACE2), 
whereas identical receptors (ACE2) can be targeted by virus species from 
different genera such as HCoV-NL63 and SARS-CoV. It seems that CoVs 
prefer certain types of host proteins as their entry receptor, with three out of 
four of the so far identified proteinaceous receptors being ectopeptidases 
(APN, ACE2, and DPP4), although enzymatic activity of these proteins 
was shown not to be required for infection by their respective viruses 
(Bosch et al., 2014). Possibly, the localization to certain membrane micro- 
domains and efficient internalization of two of these proteins in polarized 
cells (APN and DPP4) may contribute to their suitability to function as entry 
receptors (Ait-Slimane et al., 2009). In the case of MERS-CoV, the region 
of DPP4 that is bound by the S protein coincides with the binding site for its 
physiological ligand adenosine deaminase (Raj et al., 2014). Employment of 
conserved epitopes such as these may also contribute to the cross-species 
transmission potential of viruses (Bosch et al., 2014), as is exemplified by 
MERS-CoV being able to use goat, camelid, cow, sheep, horse, pig, mon- 
key, marmoset, and human DPP4 as entry receptor (Barlan et al., 2014; 
Eckerle et al., 2014; Falzarano et al., 2014; Muller et al., 2012; van 
Doremalen et al., 2014). Similarly, this may apply for the ability of feline, 
canine, porcine, and human CoVs to use fAPN as entry receptor, at least 
in vitro (Tresnan et al., 1996). 


4. S PROTEIN PROTEOLYTIC CLEAVAGE AND 
CONFORMATIONAL CHANGES 


Coronavirus entry is a tightly regulated process that appears to be 
orchestrated by multiple triggers that include receptor binding and proteo- 
lytic processing of the S protein and that ultimately results in virus-cell 
fusion. It is initiated by virion attachment mediated through interaction 
of either the NTR or CTR (or both) in the S1 subunit of the spike protein
with host receptors. Upon attachment, the virus is taken up via receptor- 
mediated endocytosis by clathrin- or caveolin-dependent pathways 
(Burkard et al., 2014; Eifart et al., 2007; Inoue et al., 2007; Nomura 
et al., 2004) although other entry routes have also been reported (Wang 
et al., 2008). Prior to and/or during endocytic uptake the CoV S protein 
is proteolytically processed. The spike protein may contain two proteolytic
cleavage sites. One of the cleavage sites is located at the boundary between
the S1 and S2 subunits (S1/S2 cleavage site), while the other cleavage site is
located immediately upstream of the first fusion peptide (S2’ cleavage site).
Although not conclusively proven, it is expected that all CoVs depend on
proteolytic cleavage at or close to S2’ for fusion to occur. Virus-cell fusion
thus not only critically depends on the conformational changes following
spike-receptor engagement, and perhaps on acidification of endosomal vesicles
(Eifart et al., 2007; Matsuyama and Taguchi, 2009; Zelus et al., 2003),
but also on proteolytic activation of the S protein by proteases along the
endocytic route (Burkard et al., 2014; Simmons et al., 2005). Indeed, inhibition
of intracellular proteases has been shown to block virus entry and
virus-cell fusion (Burkard et al., 2014; Frana et al., 1985; Simmons et al.,
2005; Yamada and Liu, 2009). The specific proteolytic cleavage requirements
of the S protein at the S1/S2 boundary and particularly at the S2’ site
may furthermore determine the intracellular site of fusion (Burkard et al.,
2014). In agreement herewith, it has become evident that the protease
expression profile of host cells may form an additional determinant of the
host cell tropism of coronaviruses (Millet and Whittaker, 2015).

Analysis of the CoV S prefusion conformation suggests that relocation
(or shedding) of the S1 subunits that cap the S2 subunit is a prerequisite
for the conformational changes in S2 that ultimately result in fusion. Shedding
of S1 probably requires receptor binding as well as proteolytic
processing at S1/S2. The cryo-EM structure indicates that the S1/S2 proteolytic
cleavage site is accessible to proteases prior to spike-receptor interaction
and, depending on the particular cleavage site present, may already be
processed in the cell in which the virions are produced. As indicated earlier,
the conformational changes in the S protein that result in virus-cell fusion
most likely also require cleavage at the S2’ site immediately upstream of
the fusion peptide. Interestingly, the S2’ cleavage site is located within an
α-helix exposed on the prefusion S structure, which prevents efficient proteolytic
cleavage (Robertson et al., 2016). This indicates the necessity for
preceding conformational changes induced by receptor binding and subsequent
shedding of S1, upon which the secondary structure of the S2’ site
transforms into a cleavable flexible loop. Following proteolytic cleavage
activation at the S2’ site, hydrophobic interactions between the fusion peptide
and the adjacent S2 helices are disturbed, which allows the four α-helices
and the connecting regions that make up the HR1 region in the prefusion
S protein to refold into a long trimeric coiled coil (Fig. 1D). This coiled coil
forms an N-terminal extension of the central helix, projecting the fusion
peptide(s) toward the target membrane. Subsequently, the fusion peptide(s)
will be inserted into the limiting membrane of the host cell endocytic compartment.
Next, as a consequence of S2 rearrangements, the two HR regions
will interact to form an antiparallel, energetically stable six-helix bundle
(Bosch et al., 2003, 2004), enabling the close apposition and subsequent
fusion of the viral and host lipid bilayers.


5. TROPISM CHANGES ASSOCIATED WITH S PROTEIN 
MUTATIONS 


Changes in the S protein may result in an altered host, tissue, or cel- 
lular tropism of the virus. This is clearly exemplified by genomic recombi- 
nation events that result in exchange of (part of) the S protein and in a 
concomitant change in tropism. The propensity of CoVs to undergo homol- 
ogous genomic recombination has been exploited for the genetic manipu- 
lation of these viruses (de Haan et al., 2008; Haijema et al., 2003; Kuo et al., 
2000). To this end, interspecies chimeric coronaviruses were generated, 
which carried the spike ectodomain of another CoV and which could be 
selected based on their altered requirement for an entry receptor. Exchange 
of S protein genes may also occur in vivo, resulting in altered tropism as is
illustrated by the occurrence of serotype II feline infectious peritonitis virus
(FIPV). This virus results from a naturally occurring recombination event
between feline and canine CoVs (CCoVs) in which the feline virus acquires
a CCoV spike gene (Herrewegh et al., 1995; Terada et al., 2014). As a result
of the acquisition of this new S protein, the rather harmless enteric feline
CoV (FECV) turns into a systemically replicating and deadly FIPV. As
FECV has a strict feline tropism (Myrrha et al., 2011), while CCoV has been
shown to infect feline cells (Levis et al., 1995), it is likely that serotype II
FIPVs arise in cats coinfected with serotype I FECV and CCoV. Furthermore,
as different recombination sites have been observed for each serotype
II FIPV, while serotype II FECVs have not been observed, it appears that
serotype II FIPVs result exclusively from recurring recombination events
(Terada et al., 2014). In addition to these feline-CCoV recombinants, a chimeric
porcine coronavirus with a TGEV backbone and a spike of the porcine
epidemic diarrhea virus (PEDV) was recently isolated from swine fecal
samples in Italy and Germany, likely also resulting from a recombination
event (Akimkin et al., 2016; Boniotti et al., 2016). Moreover, the α-CoV
HKU2 BtCoV probably resulted from genomic recombination as it encodes
an S protein that resembles a betacoronavirus S protein except for its N-terminal
region, which is similar to that of alphacoronaviruses (Lau et al., 2007).
Thus, such genomic recombination events are not necessarily restricted to
viruses of the same genus.


5.1 S1 Receptor Interactions Determining Tropism

5.1.1 S1 NTR Changes

Several changes in the amino-terminal domain of S4 have been associated 
with changes in the tropism of the virus. For example, for several a-CoVs, 
loss of NTR of the S protein appears to be accompanied with a loss of 
enteric tropism. While the porcine CoV TGEV displays a tropism for both 
the gastrointestinal and respiratory tract, the closely related PRCoV, which 
lacks the sialic acid-binding N-terminal region (Krempl et al., 1997), only 
replicates in the respiratory tract. The loss of sialic acid-binding activity 
by four-amino acid changes in the NTR of its S protein resulted in an 
almost complete loss of enteric tropism (Krempl et al., 1997). Similar to 
TGEV, enteric serotype I FCoVs also have been reported to bind to sialic 
acids (Desmarets et al., 2014). Large deletions within the S, subunit 
corresponding to the N-terminal region have been found in variants of 
the systemically replicating FIPV (strains UU16, UU21, and C3663) after 
intrahost emergence from enteric FECV (Chang et al., 2012; Terada 
et al., 2012). Also FIPVs seem to have lost the ability to replicate in the 
enteric tract (Pedersen, 2014). Clinical isolates of human coronavirus 
229E as well as of the related alpaca coronavirus, both of which cause respi- 
ratory infections, encode relatively short spike proteins that lack the NTR 
(Crossley et al., 2012; Farsani et al., 2012). In contrast, closely related bat 
coronaviruses with intestinal tropism contain S proteins with an NTR or
sometimes even two copies of the NTR (Corman et al., 2015) (Fig. 2). 
Overall, these observations suggest that the alphacoronavirus spike 
NTR—in particular its sialic acid-binding activity—may contribute to 
the enteric tropism of these alphacoronaviruses, while it is not required 
for replication in the respiratory tract or in other extraintestinal organs. It 
has been hypothesized that the sialic acid-binding activity of the spike
protein may allow virus binding to (i) soluble sialoglycoconjugates that may
protect the virus from hostile conditions in the stomach or (ii) mucins that
may prevent the loss of viruses by intestinal peristalsis and allow the virus to 
pass the thick mucus barrier, thereby gaining access to the intestinal cells to 
initiate infection (Schwegmann-Wessels et al., 2003).

Besides deletions of entire domains of the S protein, more subtle changes
consisting of amino acid substitutions in the S1 NTR may also suffice to alter
the virus’ tropism. For example, MHV variants have been observed that
acquired the ability to use the human homologue of their murine
CEACAM1a receptor to enter cells as a result of mutations in their RBD
that is located in the S1 NTR (Baric et al., 1999).


5.1.2 S1 CTR Changes

As the CTR of the S1 subunit contains the RBD for most CoVs, mutations in
this part of S have also been associated with changes in the virus’
tropism. Perhaps the best-known example of viral cross-species
transmission involves the SARS-CoV. Studies support a transmission model in
which a SARS-like CoV was transmitted from Rhinolophus bats to palm 
civets, which subsequently transmitted the palm civet-adapted virus to 
humans at local food markets in southern China (Li et al., 2006). According 
to this model, SARS-like viruses adapted to both the palm civet and human 
host, which was reflected in the rapid viral evolution observed for these 
viruses within these species (Song et al., 2005). Two amino acid
substitutions within the RBD were identified that are of relevance for binding
to the ACE2 proteins of palm civets and humans (Li et al., 2005b, 2006; 
Qu et al., 2005). From these studies it appears that due to strong conserva- 
tion of ACE2 between mammalian species only a few amino acid alterations 
within the RBD are needed to change coronavirus host species tropism. 
Indeed serial passage of SARS-CoVs in vitro or in vivo can rapidly lead 
to adaptation to new host species (Roberts et al., 2007). SARS-like viruses 
isolated from bats displayed major differences including a deletion in the 
ACE2 RBM compared to human SARS-CoV (Drexler et al., 2010; Ren 
et al., 2008) and as a consequence were unable to use human ACE2 as
an entry receptor (Becker et al., 2008). However, recently a novel SARS- 
like BtCoV was identified, which could use ACE2 of Rhinolophus bats, palm 
civets as well as of humans as a functional receptor (Ge et al., 2013). These 
findings not only provide further evidence that bats are indeed the natural 
reservoir for SARS-like CoVs, but also that these bat coronaviruses can 
directly include human ACE2 in their receptor repertoire. The detection 
of sequences of SARS-CoV-like viruses in palm civets and raccoon dogs 
(Guan et al., 2003; Tu et al., 2004) therefore probably reflects the unusually 
wide host range of these viruses. A similar promiscuous receptor usage is also 
observed for MERS-CoV, which binds to DPP4 of many species (Barlan
et al., 2014; Eckerle et al., 2014; Falzarano et al., 2014; Muller et al., 2012;
van Doremalen et al., 2014) as indicated earlier. 

Just as SARS-like CoVs and MERS-CoV are able to use entry receptors of
different host species, several α-CoVs also display promiscuity toward
orthologous receptors. For example, the feline APN molecule can be used as a
receptor by feline (serotype II FIPV), canine (CCoV), porcine (TGEV), 
and human (HCoV-229E) α-CoVs in cell culture (Tresnan and Holmes,
1998; Tresnan et al., 1996). Conversely, serotype II FIPV can only enter 
cells expressing feline APN (Tresnan and Holmes, 1998). The ability of 
TGEV and CCoV to use feline APN as a receptor probably results from 
strong conservation of the viral-binding motif (VBM) among APN 
orthologs in combination with the RBDs recognizing APN in a similar 
fashion (Reguera et al., 2012). Though recruiting the same receptor, 
HCoV-229E binds another domain within APN, which apparently is also 
conserved in feline APN (Kolb et al., 1997; Tusell et al., 2007).
Conservation of the VBM obviates the need for large adaptations within the
RBD of these viruses to orthologous receptors, allowing more facile
cross-species transmission.

Other mutations in the S1 CTR associated with altered tropism have
been described for the β-CoV MHV. Similar to the humanized
CEACAM1a-recognizing MHV variant, serial passaging of virus-infected
cells resulted in the selection of viruses with an extended host range, which
were subsequently shown to be able to enter cells in a heparan
sulfate-dependent and CEACAM1a-independent manner (de Haan et al., 2005;
Schickli et al., 1997). Two sets of mutations in the S protein were shown
to be critically required for this phenotype, both of which resulted in the
occurrence of multibasic heparan sulfate-binding sites. While one heparan
sulfate-binding site was located in the S2 subunit immediately upstream of
the fusion peptide, the other was located in the S1 CTR. The presence
of this latter, but not of the former, domain resulted in MHV that depended
on both heparan sulfate and CEACAM1a for entry. Additional introduction
of the second heparan sulfate-binding site enabled the virus to become
mCEACAM1a independent (de Haan et al., 2006). In addition, a mutation
of the HVR of S1 may affect CoV tropism as was demonstrated for the MHV
strain JHM (MHV-JHM). The spike protein of MHV-JHM may induce
receptor-independent fusion (Gallagher et al., 1992, 1993). However,
deletion of residues in the HVR of MHV-JHM resulted in the spike protein being
entirely dependent on CEACAM1a binding for fusion (Dalziel et al., 1986;
Gallagher and Buchmeier, 2001; Phillips and Weiss, 2001).


5.2 Changes in Proteolytic Cleavage Site and Other S2 Mutations Associated with Altered Tropism

5.2.1 Changes in Proteolytic Cleavage Sites

Although the S2 subunit does not appear to contain any RBDs, several 
mutations in this subunit have been associated with changes in the virus’ tro- 
pism. Some of these changes affect the cleavage sites in the S protein that are 
located at the S1/S2 boundary or immediately upstream of the fusion peptide 
(S2’ cleavage site). As these cleavages appear to be essential for virus-cell
fusion, the availability of host proteases to process the S protein is of critical 
importance for the virus’ tropism. The importance of S protein cleavage at 
the S1/S2 boundary for the tropism of the virus is exemplified by the BtCoV
HKU4, which is closely related to the MERS-CoV. Although domain B of 
the HKU4 S protein can interact with both bat and human DPP4, it is only 
in the context of bat cells, but not human cells, that the virus can utilize these 
molecules as entry receptors (Yang et al., 2014). In contrast, MERS-CoV 
can enter cells of human and bat origin via both DPP4 orthologues. This 
difference results from host restriction factors at the level of proteolytic 
cleavage activation. Two amino acid substitutions (S746R and N762A)
in the S1/S2 boundary of the S protein were shown to be crucial for the
adaptation of bat MERS-like CoV to the proteolytic environment of the 
human cells (Yang et al., 2015). 

Although probably not directly responsible for the tropism change asso- 
ciated with the enterically replicating FECV evolving into the systemically 
replicating FIPV, loss of a furin cleavage site at the S1/S2 junction is observed in
the majority of the FIPVs, whereas this furin cleavage site is strictly con- 
served in the parental FECV strains (Licitra et al., 2013). Apparently, con- 
servation of this furin cleavage site is not required for efficient systemic 
replication. However, as FIPV is generally not found in the feces of cats, 
it may well be that loss of the furin cleavage site at S1/S2, as well as
mutations in other parts of the genome such as the accessory genes, may
prevent efficient replication of FIPV in the enteric tract.

Besides the influence of the S1/S2 cleavage site, virus tropism may also 
depend on the S2’ cleavage site upstream of FP1. In contrast to wild-type
MHV strain A59, a recombinant MHV carrying a furin cleavage site at this
position was shown to no longer depend on lysosomal proteases for efficient 
entry to occur (Burkard et al., 2014). As a consequence, this virus was able to 
infect cells in which trafficking to lysosomes was inhibited. Cleavage at the 
S2’ site may also be important for the tropism of PEDV, which causes major
damage to the biofood industry in Asia and the Americas (Lee, 2015;
Song et al., 2015). PEDV replication in cell culture is strictly dependent
on trypsin-like proteases, a requirement which is expected to limit its tro- 
pism in vivo to the enteric tract. The trypsin dependency of PEDV entry 
was shown, however, to be lifted after introduction of a furin cleavage
site at the S2’ cleavage site by a single amino acid substitution. Such
mutations may potentially affect the spread of this virus in the pig by
allowing it to
replicate in nonenteric tissues in the absence of trypsin-like proteases 
(Li et al., 2015). 


5.2.2 Other S2 Mutations Associated with Altered Tropism

Mutations in parts of the S2 subunit other than those affecting the
proteolytic cleavage sites may also influence the tropism of different CoVs.
Several studies report a correlation between mutations in the HR1 region 
of FCoVs and the conversion of FECV into FIPV (Bank-Wolf et al., 
2014; Desmarets et al., 2016; Lewis et al., 2015). Such a correlation 
appeared even more convincing for mutations found in the recently 
identified FP2 (Chang et al., 2012; Ou et al., 2016). While these corre- 
lations suggest an important role for the S protein in the transition of 
FECV into FIPV, the causal relationship between these mutations in 
S and FIP remains to be determined. It is plausible, however, that such 
mutations may play a role in the acquired ability of FIPVs to infect mac- 
rophages. Indeed, for serotype II FCoV, the ability to replicate in 
macrophages was shown to be determined by residues located in the 
C-terminal part of the S2 subunit, although the responsible residues were
not identified (Rottier et al., 2005). 

Also for other CoVs, mutations in the S2 subunit have been linked to
changes in the virus’ tropism. A serially passaged MHV-A59 virus was
shown to obtain mutations (M936V, P939L, F948L, and S949I) in and
adjacent to the HR1 region which conveyed host range expansion of the 
mutant virus to normally nonpermissive mammalian cell types in vitro 
(Baric et al., 1999; McRoy and Baric, 2008). In contrast, Krueger et al.
reported three mutations in the S2 subunit of MHV-JHM (V870A located
upstream of the S2’ cleavage site and A994V and A1046V located in the
HR1 region) all of which reduced the CEACAM1a-independent 
fusogenicity of this virus (Krueger et al., 2001). Many studies on MHV- 
JHM point to a crucial role of a leucine at amino acid position 1114 in 
S protein fusogenicity. The MHV S cryo-EM structure demonstrates 
that the L1114 residue is located in the central helix and contributes to
interprotomer interactions. An L1114F substitution in the MHV-JHM S protein
was observed in a mutant strain of JHM and correlated with an increased
S1-S2 stability and the loss of the ability to induce CEACAM1a-independent
fusion (Taguchi and Matsuyama, 2002), while a substitution
of the same residue to an Arg (L1114R) reduced the neurotropism of this 
virus (Tsai et al., 2003). Mutants resistant to a monoclonal antibody 
(Wang et al., 1992) and soluble receptor (Saeki et al., 1997) also correlate 
with substitutions at this specific residue, illustrating the importance of this 
residue in S fusogenicity. For the MERS-CoV, mutations in HR1 have 
been identified that are thought to be associated with its adaptive evolution 
(Forni et al., 2015). Among these sites, position 1060 is particularly interest- 
ing, as it appears to correspond to substitutions found in MHV and IBV that 
modify the tropism of these viruses (MHV: E1035D; IBV: L857F; Navas- 
Martin et al., 2005; Yamada et al., 2009). Substitution E1035D in HR1 of 
MHV was shown to restore the hepatotropism of an otherwise
nonhepatotropic MHV, the latter resulting from mutations in the S1 NTR
and the S1/S2 cleavage site. These studies collectively indicate that mutations
in and close to the HR regions may affect CoV tropism, possibly by affecting 
the metastability and consequently fusogenicity of the S protein and/or the 
formation of the postfusion six-helix bundle. 


6. CONCLUDING REMARKS 


It appears that changes in the S protein associated with altered tropism 
can be found in several regions of the spike protein. These regions obviously 
include the NTR and CTR of S1 that are involved in the interaction with
attachment and/or entry receptors. Substitutions within the S1 RBDs may
convey an altered viral tropism by adaptation of the virus to new or
orthologous entry receptors. In addition, the S protein cleavage sites are
important for host tropism as the processing of these sites by host proteases
will critically affect the removal of the S1-mediated locking of the S2
prefusion conformation by shedding of S1 (S1/S2 cleavage site) and the
release of the fusion peptide(s) (S2’ cleavage site). Finally, changes in S2
(particularly in the HR regions) may compensate for as yet suboptimal spike
binding to orthologous receptors, such that interactions of relatively low
affinity suffice to induce the required conformational changes of the
S protein that ultimately result in the formation of the postfusion six-helix
bundle and virus-cell fusion.

The observation that the different domains of the S protein all contribute 
to the tropism of CoVs is indicative of a coordinated interplay between these
domains. This interplay has also been inferred from several studies, which
reported changes in one S protein subunit often to be accompanied by
adaptations in the other subunit (Saeki et al., 1997; Wang et al., 1992). In
addition, the interplay between S1 and S2 has also been shown to be important
for changes in the tropism of the virus as indicated earlier (de Haan et al., 
2006; Navas-Martin et al., 2005). The recently published cryo-EM struc- 
tures of CoV spike proteins (Kirchdoerfer et al., 2016; Walls et al., 2016) 
now provide structural evidence for the complex interplay between the sub- 
units and domains of the S protein. 

From all these studies, a picture arises in which the S protein is progres- 
sively destabilized through receptor engagement and proteolytic activation. 
In this process the S1 subunits serve as a safety pin that stabilizes the fusogenic
S2 trimer. The safety pin is discharged upon interaction with a specific
receptor and processing by host cell proteases and thereby gives way to
conformational changes of the unstable S2 subunit. Subsequent release of the
fusion peptide may resemble the pulling of the trigger, which inevitably
results in fusion of viral and host membranes through interaction of the
heptad repeat regions.

Based on the presented data we propose a model in which the ability of a
CoV to cross the host species barrier is critically dependent on the interplay
between the different regions of the S proteins. In this model, the probable
low affinity of the S1 RBD for a novel receptor must be compensated by
sufficiently low S2 metastability, which depends on both proteolytic
cleavage of the S protein and the S2 interprotomer interactions. These required
S protein characteristics may be generated during naturally occurring 
quasispecies variation and may result in the ability of the virus to replicate 
in and adapt to a new host.